ABSTRACT

The paper points out that, in working with special
groups, correlations are often distorted because the variability of
the measures being correlated is restricted in the groups. Presented
is a formula whereby a Pearson product-moment correlation can be
corrected for restrictions in range in situations where the basis of
selection is unmeasured, but where the extent of restriction for each
of the two measures being correlated is known, and where the
variables are assumed to be normally distributed in the population.
Three examples of the use of the formula are given: a case where a
comparison is to be made between a value derived from an unrestricted
sample and one derived from a restricted sample; a case where a
correlation is obtained on a special restricted sample and must be
generalized to the population; and the estimation of the validity of a
test, where the criterion and the test scores are available on the
same individuals only in a restricted sample where the basis of the
selection is not clear or not measured. (Author/ v N)
ABSTRACT

In working with special groups, correlations are often distorted because
the variability of the measures being correlated is restricted in
the groups. The formula presented in this paper can be used to correct
product-moment correlations for this distortion even when the basis of the
restriction is unknown.
CORRECTING CORRELATIONS FOR RESTRICTIONS IN RANGE
DUE TO SELECTION ON AN UNMEASURED VARIABLE* 

N. Dale Bryant

Teachers College, Columbia University 

Sunanda Gokhale 
Albany, New York 

*The work presented or reported herein was performed pursuant to a
grant from the U.S. Office of Education, Department of Health, Education
and Welfare.

The size of a correlation coefficient is dependent in part upon the
variability of the measured values in the correlation sample. Any time that a
sample is restricted in range on either or both of the measures, the correlation
between those two measures will tend to be lowered as compared to the same
correlation based upon a representative sample of the population. If prediction
within the restricted sample is the purpose of the correlation, then the obtained
value is the meaningful and correct one. However, if, for some reason, it is not
possible to correlate the variables using an unrestricted sample, we can infer the
relationship between the two measures irrespective of the restriction if we
correct the correlation for the effect of the restriction in range. For example,
if, in a sample of bright students, reading achievement and academic grades show
only a .2 correlation, we cannot infer that this is the general relationship
between reading and school grades. Since a high IQ group will tend to make high
grades and will also tend to be high on reading ability, there is likely to be
severe restriction in range on both variables. For prediction within the high IQ
group, the .2 correlation is appropriate, but to infer beyond the sample, a
correction for restrictions in range is necessary. Guilford (1965, pp. 341-345)
gives three formulae, attributed to Karl Pearson, to correct a Pearson
product-moment correlation coefficient for restriction in range when restriction
results from selection on one of the two variables being correlated or on some
measured third variable. The assumption must be made that the variables are
normally distributed in the population.
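The lowering effect described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the three variables are simulated standard normal scores, the loadings .8 and .7 on the selection variable are hypothetical values chosen for illustration (they are not data from this paper), and the names `pearson_r` and `simulate_attenuation` are invented here.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_attenuation(n=50_000, cutoff=0.5, seed=0):
    """Return (r in the full sample, r in the sample selected on a third variable)."""
    rng = random.Random(seed)
    iq, reading, grades = [], [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)  # hypothetical selection variable ("IQ")
        iq.append(z)
        # Hypothetical loadings: both measures depend on the selection variable,
        # so the population correlation is .8 * .7 = .56.
        reading.append(0.8 * z + 0.6 * rng.gauss(0.0, 1.0))
        grades.append(0.7 * z + math.sqrt(1 - 0.7 ** 2) * rng.gauss(0.0, 1.0))
    r_full = pearson_r(reading, grades)
    keep = [i for i in range(n) if iq[i] > cutoff]  # keep only the "bright" group
    r_restricted = pearson_r([reading[i] for i in keep],
                             [grades[i] for i in keep])
    return r_full, r_restricted


if __name__ == "__main__":
    r_full, r_restricted = simulate_attenuation()
    print(f"full sample r = {r_full:.2f}, restricted sample r = {r_restricted:.2f}")
```

Selecting only cases above the cutoff shrinks the variance of both measures, and the correlation computed within the selected group comes out well below the full-sample value.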



PROBLEM 

In many clinical and other settings, the sample is obviously restricted
in range on different variables, but the basis for the restrictions (i.e.,
the selection variables) is complex, unknown, or unmeasurable. Examples of
such sampling might be children coming to a particular clinic, cases receiving
a particular diagnosis, or individuals exhibiting a particular behavior. In
all these cases, the samples may show restrictions in range on variables being
correlated, but the basis of the restrictions cannot be reduced to a measurable
variable. In these instances, the formulae presented by Guilford cannot be used.
It is possible, however, to correct for restrictions in range, even though the
selection variable is unknown or unmeasured, by using information about the
extent of the restriction on each of the two variables being correlated.

This paper presents a formula whereby a Pearson product-moment correlation
can be corrected for restrictions in range for these special but very frequent
situations where the basis of selection is unmeasured but where the extent of
restriction for each of the two measures being correlated is known and where the
variables are assumed to be normally distributed in the population.

FORMULA FOR USE WHEN RESTRICTIONS RESULT
FROM COMPLEX OR UNMEASURED VARIABLES

Starting with Guilford's formula for correcting r₁₂ for restriction in
range, we can rewrite his Formula II so that it corrects a correlation r₁₃,
where restriction is produced by selection on the basis of variable 3 and there
is knowledge of the standard deviations for variable 1 in both the restricted
and unrestricted samples. Similarly, we can rewrite his Formula I so that it
corrects a correlation r₁₃, where restriction is produced by selection on the
basis of variable 3 and there is knowledge of the standard deviations for
variable 3 in both the restricted and unrestricted groups. By equating these
two formulae and squaring and simplifying them, we can obtain an equivalent
value for the ratio of unrestricted to restricted variances on variable 3,
expressed in terms of the ratio of unrestricted to restricted variances on
variable 1 and the correlation r₁₃. The same procedure can be followed by
rewriting Formulae I and II to correct r₂₃ so as to obtain an equivalent value
for the ratio of unrestricted to restricted variances on variable 3, expressed
in terms of the ratio of unrestricted to restricted variances on variable 2
and the correlation r₂₃. Thus, the information about restriction on variable
3 is expressed in terms of information about the variables 1 and 2 and the
correlations r₁₃ and r₂₃.
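The equating step can be sketched as follows. This assumes that Guilford's two single-variable formulae have their standard textbook forms for selection on variable 3; the symbol k (ratio of unrestricted to restricted variance on variable 3) and the use of s for restricted and σ for unrestricted standard deviations are notation chosen here for illustration.

```latex
% Version using the standard deviations of variable 3, with k = \sigma_3^2 / s_3^2:
\[ R_{13}^2 = \frac{r_{13}^2\,k}{1 - r_{13}^2 + r_{13}^2\,k} \]
% Version using the standard deviations of variable 1:
\[ R_{13}^2 = 1 - \frac{s_1^2}{\sigma_1^2}\,\bigl(1 - r_{13}^2\bigr) \]
% Equating the two expressions for 1 - R_{13}^2 and simplifying:
\[ \frac{1 - r_{13}^2}{1 + r_{13}^2\,(k - 1)} = \frac{s_1^2}{\sigma_1^2}\,\bigl(1 - r_{13}^2\bigr)
   \quad\Longrightarrow\quad
   k = 1 + \frac{1}{r_{13}^2}\left(\frac{\sigma_1^2}{s_1^2} - 1\right) \]
% The same steps applied to r_{23} and variable 2 give the second estimate:
\[ k = 1 + \frac{1}{r_{23}^2}\left(\frac{\sigma_2^2}{s_2^2} - 1\right) \]
```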

These equivalent ratio values described above can be substituted into
Guilford's Formula III (for R₁₂), where restriction is produced by selection
on the basis of variable 3 and there is knowledge of the standard deviations
for variable 3 in both the restricted and unrestricted groups and where r₁₃
and r₂₃ are known. However, since there are two estimates of the ratio of
unrestricted to restricted variances on variable 3, we must express the value
as the square root of the product of the two estimates (viz., a = √(a₁ × a₂),
where a₁ and a₂ denote the two estimates).

The resulting formula for the corrected correlation (R₁₂) is given below:
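The displayed equation is missing from this copy of the text. Carrying out the substitution described above, and assuming that Guilford's Formula III has its standard three-variable form, gives the reconstruction below. It reproduces the .68 reported in the first example of the next section, and it takes the positive square root, which presumes that variables 1 and 2 are related to the selection variable in the same direction.

```latex
% Guilford's Formula III (standard form), with k = \sigma_3^2 / s_3^2:
\[ R_{12} = \frac{r_{12} + r_{13}\,r_{23}\,(k - 1)}
                 {\sqrt{\bigl[1 + r_{13}^2 (k - 1)\bigr]\bigl[1 + r_{23}^2 (k - 1)\bigr]}} \]
% Equivalent ratios obtained from the two single-variable formulae:
\[ 1 + r_{13}^2 (k - 1) = \frac{\sigma_1^2}{s_1^2}, \qquad
   1 + r_{23}^2 (k - 1) = \frac{\sigma_2^2}{s_2^2} \]
% Square root of the product of the two estimates, used in the numerator:
\[ r_{13}\,r_{23}\,(k - 1) =
   \sqrt{\left(\frac{\sigma_1^2}{s_1^2} - 1\right)\left(\frac{\sigma_2^2}{s_2^2} - 1\right)} \]
% Substituting and simplifying:
\[ R_{12} = r_{12}\,\frac{s_1 s_2}{\sigma_1 \sigma_2}
          + \sqrt{\left(1 - \frac{s_1^2}{\sigma_1^2}\right)\left(1 - \frac{s_2^2}{\sigma_2^2}\right)} \]
```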
This formula does not require all of the information necessary for Guilford's
Formulae I, II, and III, but it can be used to obtain a product-moment
correlation coefficient that is corrected for restrictions in range (R₁₂) knowing only
the uncorrected correlation (r₁₂), the standard deviations of the two variables
in the restricted sample (s₁ and s₂), and the standard deviations of the two
variables in the unrestricted sample (σ₁ and σ₂).
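As a minimal sketch of how the correction might be computed from these five quantities, the function below assumes the corrected correlation takes the form R₁₂ = r₁₂·s₁s₂/(σ₁σ₂) + √[(1 − s₁²/σ₁²)(1 − s₂²/σ₂²)]. That form is a reconstruction from the derivation described in this section, not a quotation of the printed equation, and the function and parameter names are hypothetical.

```python
import math


def correct_for_range_restriction(r12, s1, s2, sigma1, sigma2):
    """Correct r12 for range restriction caused by an unmeasured selection variable.

    r12            : correlation observed in the restricted sample
    s1, s2         : standard deviations in the restricted sample
    sigma1, sigma2 : standard deviations in the unrestricted sample

    Assumed (reconstructed) form:
        R12 = r12 * (s1 * s2) / (sigma1 * sigma2)
              + sqrt((1 - s1**2 / sigma1**2) * (1 - s2**2 / sigma2**2))
    The positive root presumes both variables relate to the selection
    variable in the same direction.
    """
    if not (0 < s1 <= sigma1 and 0 < s2 <= sigma2):
        raise ValueError("restricted SDs must be positive and no larger than unrestricted SDs")
    u1 = s1 / sigma1  # proportion of the SD of variable 1 that survives selection
    u2 = s2 / sigma2  # proportion of the SD of variable 2 that survives selection
    return r12 * u1 * u2 + math.sqrt((1 - u1 ** 2) * (1 - u2 ** 2))


if __name__ == "__main__":
    # Values from the clinic example in this paper, given there as variances.
    corrected = correct_for_range_restriction(
        0.40, math.sqrt(2.59), math.sqrt(186), math.sqrt(9), math.sqrt(289)
    )
    print(round(corrected, 2))
```

When neither variable is restricted (s₁ = σ₁ and s₂ = σ₂), the second term vanishes and the function returns r₁₂ unchanged.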



EXAMPLES OF USE OF THE FORMULA 



In a clinical sample of children, it was noted that a particular measure
(the Coding subtest on the Wechsler Intelligence Scale for Children) was
consistently lower than the average of the other intelligence subtests. The sample
consisted of children of average or above average IQ who were brought by their
parents to a clinic because school remedial procedures were not correcting the
children's severe reading retardation. To study the nature of this lowered
performance, the variable, Coding, was correlated with other reference variables such
as the Perceptual Speed Test of the Primary Mental Abilities Test battery. The
correlation of a reference variable and the Coding subtest needs to be compared
to equivalent values in a sample representative of the population as given in other
research studies. In order to make the correlation based upon the clinical sample
comparable to the correlation based upon the sample representative of the
population, it is necessary to correct for restrictions in range, since both Coding and
the reference variable, Perceptual Speed, show consistently lower scores than are
normally found in a presumably representative sample from the population. The
specific factors responsible for the restriction in range cannot be measured, since
coming to a clinic involves much more than poor reading. In both Coding and
Perceptual Speed, we can assume normality of distribution within the population.

The values obtained for the clinic sample are as follows: r₁₂ = .40, where
1 and 2 represent Coding and Perceptual Speed respectively; s₁² = 2.59 and
s₂² = 186, where s² is the variance for the clinic sample. Equivalent values for
normative samples of appropriate age as given in the manuals for the respective tests
are σ₁² = 9 and σ₂² = 289, where σ² is the variance based upon the normative
samples. Substituting in the final formula given above:
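The substitution itself is missing from this copy. The following worked version is a reconstruction that assumes the formula has the form shown in its first line; it arrives at the .68 that the text reports for the clinic sample.

```latex
\[ R_{12} = r_{12}\sqrt{\frac{s_1^2\,s_2^2}{\sigma_1^2\,\sigma_2^2}}
          + \sqrt{\left(1 - \frac{s_1^2}{\sigma_1^2}\right)\left(1 - \frac{s_2^2}{\sigma_2^2}\right)} \]
\[ R_{12} = .40\sqrt{\frac{(2.59)(186)}{(9)(289)}}
          + \sqrt{\left(1 - \frac{2.59}{9}\right)\left(1 - \frac{186}{289}\right)} \]
\[ R_{12} \approx .40\,(.430) + \sqrt{(.712)(.356)} \approx .172 + .504 \approx .68 \]
```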
A study, based upon a "normal" sample of eighth grade children (which is
roughly comparable to the grade placement of the clinic sample) and having
variances similar to the population values, reported that r = .37.

By using the correction for restrictions in range, it is possible to
compare the .68 in the clinic sample with the .37 in the normal sample. It suggests
that there is a higher degree of relationship between these two measures in the
clinic sample (and confirms certain conclusions drawn from clinical observation).
While it is beyond the scope of this paper to comment upon the interpretation of
this finding, it is apparent that interpretations could be made that could not
have been made if there had been no correction for restrictions in range.

The illustration above is of a case where a comparison is to be made between
a value derived from an unrestricted sample and one derived from a restricted
sample. The values have to be expressed in comparable terms, so the correction for
restrictions is necessary.

Another example of a case where the correction for restrictions in range
is necessary is when a correlation is obtained on a special, restricted sample
and must be generalized to the population. An example of this might be a study
of the relationship between the amount of a particular chemical in the blood and
the frequency of hallucinatory-type activity. Since this is hard to study in a
nonclinical population, we might study it in a sample of individuals diagnosed
as schizophrenic. If schizophrenics seldom have a low concentration of the
chemical in their blood and if they tend to show more frequent hallucinatory-type
activity than would be true for the total population, then both of these variables
are restricted in range. A correlation between the two variables in the
schizophrenic sample can be used to infer what the relationship would be in the total
population if it is assumed that the same relationship holds true for lower levels
of the chemical and less frequent hallucinatory-type activity and that the
clinical sample merely represents one end of a distribution on these two variables,
which are normally distributed in the population. While these assumptions might
not be justified, it is evident that, if they are made, the correlation based
upon the schizophrenic sample would have to be corrected for restrictions in
range in order to infer the relationship in the population. The basis of the
selection of the sample is complex, and, unless a measure of the selection
variable can be obtained, it would be necessary to use a formula such as the one
presented in this paper.
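Under exactly the assumptions just listed, the correction can be checked by simulation. The sketch below is illustrative only: the data are simulated, the loadings .8 and .7 on the unmeasured selection variable are hypothetical, the names are invented here, and the correction is applied in the reconstructed form R₁₂ = r₁₂·s₁s₂/(σ₁σ₂) + √[(1 − s₁²/σ₁²)(1 − s₂²/σ₂²)]. The selection variable is used only to form the restricted sample and never enters the correction.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def std_dev(xs):
    """Standard deviation (divisor n) of a sequence."""
    mean = sum(xs) / len(xs)
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))


def simulate_correction(n=50_000, cutoff=0.5, seed=1):
    """Return (population r, restricted-sample r, corrected r)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)  # unmeasured selection variable
        x1 = 0.8 * z + 0.6 * rng.gauss(0.0, 1.0)                      # hypothetical loading .8
        x2 = 0.7 * z + math.sqrt(1 - 0.7 ** 2) * rng.gauss(0.0, 1.0)  # hypothetical loading .7
        rows.append((z, x1, x2))
    pop1 = [row[1] for row in rows]
    pop2 = [row[2] for row in rows]
    selected = [row for row in rows if row[0] > cutoff]  # one end of the distribution
    clin1 = [row[1] for row in selected]
    clin2 = [row[2] for row in selected]
    r_clinic = pearson_r(clin1, clin2)
    u1 = std_dev(clin1) / std_dev(pop1)  # extent of restriction on variable 1
    u2 = std_dev(clin2) / std_dev(pop2)  # extent of restriction on variable 2
    r_corrected = r_clinic * u1 * u2 + math.sqrt((1 - u1 ** 2) * (1 - u2 ** 2))
    return pearson_r(pop1, pop2), r_clinic, r_corrected


if __name__ == "__main__":
    r_pop, r_clinic, r_corrected = simulate_correction()
    print(f"population {r_pop:.2f}, restricted {r_clinic:.2f}, corrected {r_corrected:.2f}")
```

In this setup the corrected value should land close to the population correlation, while the uncorrected restricted-sample value falls well below it. If selection acted on the two variables through more than one route, or in opposite directions, the agreement would not be expected to hold.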

Another example of the application of the formula would be its use in
estimating the validity of a test where the criterion and test scores are available
on the same individuals only in a restricted sample where the basis of the
selection is not clear or not measured. If the variance of the test is known for some
sample that is representative of the population and the variance of the criterion
is known for some other sample representative of the population, the formula can
provide the correction to estimate the validity of the test in an unrestricted
sample.
SUMMARY

There are many times that Pearson product-moment correlations are based
on clinical samples or other special groups where there are restrictions in
range on the variables being correlated and where the basis of the selection
that causes the restrictions is unknown or unmeasured. It is often necessary
either to compare the correlation with values derived from a sample
representative of the population or to infer from the special sample the nature of the
relationship that exists between the two variables within the total population.
In such cases, if the assumption can be made that the variables are normally
distributed in the population, the formula presented in this paper is applicable
in correcting the correlation coefficient for restrictions in range.
FOOTNOTE

In kindly checking this derivation, Dr. Rosedith Sitgreaves, Principal
Advisor, Educational Research and Statistical Methods Area, Psychology Department,
Teachers College, Columbia University, pointed out that the formula could be
obtained somewhat more directly without recourse to the Guilford formulae. The
senior author will be happy to send both the original and Dr. Sitgreaves'
derivations to anyone requesting them.